Abstract 

Return words constitute a powerful tool for studying symbolic dynamical systems. They may 
be regarded as a discrete analogue of the first return map in dynamical systems. In this paper 
we investigate two abelian variants of the notion of return word, each of which gives rise to a 
new characterization of Sturmian words. We prove that a recurrent infinite word is Sturmian 
if and only if each of its factors has two or three abelian (or semi-abelian) returns. We study 
the structure of abelian returns in Sturmian words and give a characterization of those factors 
having exactly two abelian returns. Finally we discuss connections between abelian returns and 
periodicity in words. 

1. Introduction 

Let $w \in A^{\mathbb{N}}$ be an infinite word with values in a finite alphabet A. The (factor) complexity 
function $p : \mathbb{N} \to \mathbb{N}$ assigns to each n the number of distinct factors of w of length n. A 
fundamental result of Hedlund and Morse [15] states that a word w is ultimately periodic if and 
only if for some n the complexity $p(n) \le n$. Infinite words of complexity $p(n) = n + 1$ are called 
Sturmian words. The most studied Sturmian word is the so-called Fibonacci word 

01001010010010100101001001010010 . . . 

fixed by the morphism $0 \mapsto 01$ and $1 \mapsto 0$. In [16] Hedlund and Morse showed that each Sturmian 
word may be realized geometrically by an irrational rotation on the circle. More precisely, 
every Sturmian word is obtained by coding the symbolic orbit of a point x on the circle (of 
circumference one) under a rotation by an irrational angle $\alpha$ where the circle is partitioned into 
two complementary intervals, one of length $\alpha$ and the other of length $1 - \alpha$. And conversely 
each such coding gives rise to a Sturmian word. The irrational $\alpha$ is called the slope of the Sturmian 
word. An alternative characterization using continued fractions was given by Rauzy in 
[17] and [18], and later by Arnoux and Rauzy in [2]. Sturmian words admit various other types 
of characterizations of geometric and combinatorial nature (see for instance [6]). For example 
they are characterized by the following balance property: A word w is Sturmian if and only if 
w is a binary aperiodic (non-ultimately periodic) word and $||u|_i - |v|_i| \le 1$ for all factors u and 



Email addresses: svepuz@utu.fi (Svetlana Puzynina), zamboni@math.univ-lyon1.fr (Luca Q. Zamboni) 
^Partially supported by grant no. 251371 from the Academy of Finland, by Russian Foundation of Basic 
Research (grant 10-01-00424) and by RF President grant for young scientists (MK-4075.2012.1). 
^Partially supported by a grant from the Academy of Finland and by ANR grant SUBTILE. 



Preprint submitted to Elsevier 



v of w of equal length, and for each letter i. Here $|u|_i$ denotes the number of occurrences of i in u. 
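
The notions introduced so far (the Fibonacci word, factor complexity and the balance property) can be tested on a finite prefix. The following is only a minimal sketch: the function names are illustrative choices, not notation from the paper, and a finite prefix can only confirm these properties up to a chosen length.

```python
def fibonacci_word(min_length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    rules = {"0": "01", "1": "0"}
    word = "0"
    while len(word) < min_length:
        word = "".join(rules[c] for c in word)
    return word


def complexity(word, n):
    """Number of distinct factors of length n occurring in the finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


def is_balanced(word, max_len):
    """Check ||u|_1 - |v|_1| <= 1 for all factors u, v of equal length <= max_len.

    For a binary word it is enough to count occurrences of the letter 1.
    """
    for n in range(1, max_len + 1):
        ones = {word[i:i + n].count("1") for i in range(len(word) - n + 1)}
        if max(ones) - min(ones) > 1:
            return False
    return True


fib = fibonacci_word(1000)
print(fib[:32])
print([complexity(fib, n) for n in range(1, 11)])
print(is_balanced(fib, 20))
```

The prefix is taken much longer than the factor lengths examined, so that every short factor of the infinite word has a chance to appear in it.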

In this paper we develop and study two abelian analogues of the notion of return word and 
apply them to characterize Sturmian words. Return words constitute a powerful tool for studying 
various problems in combinatorics on words, symbolic dynamical systems and number theory. 
Given a factor v of an infinite word w, by a return word to v (in w) we mean a factor u of w such 
that uv is a factor of w beginning and ending in v and having no other (internal) occurrence 
of v. In other words the set of all return words to v is the set of all distinct words beginning 
with an occurrence of v and ending just before the next occurrence of v. The notion of return 
words can be regarded as a discrete analogue of the first return map in dynamical systems. 
Many developments of the notion of return words have been given: For example, return words 
are used to characterize primitive substitutive sequences [H [10] • Return words are used in 
studying the transcendence of Sturmian or morphic continued fractions ^ . Return words were 
fruitfully studied in the context of interval exchange transformations (see [21] ) . Words having a 
constant number of return words were considered in [5]. In [9] a generalization of the notion of 
balanced property for Sturmian words was introduced and the proof is based on return words. 
Return words are also used to characterize periodicity and Sturmian words. The following 
characterization was obtained by L. Vuillon in [20]: 

Theorem 1. [20] A binary recurrent infinite word w is Sturmian if and only if each factor u of 
w has two returns in w. 



In [12] the proofs were simplified and return words were studied in the context of episturmian 
words. 

Two words are said to be abelian equivalent if they are permutations of each other. It is 
readily verified that this defines an equivalence relation on the set of all factors of an infinite 
word. Various abelian properties of words have been extensively investigated including abelian 
powers and their avoidance, abelian complexity and abelian periods [SI HI [3 [131 EH]- Given a 
factor u of an infinite word w, let $n_1 < n_2 < n_3 < \ldots$ be all integers such that $w_{n_i} \ldots w_{n_i+|u|-1}$ 
is abelian equivalent to u. Then we call each $w_{n_i} \ldots w_{n_{i+1}-1}$ a semi-abelian return to u. By an 
abelian return to u we mean an abelian class of $w_{n_i} \ldots w_{n_{i+1}-1}$. We note that in both cases these 
definitions depend only on the abelian class of u. Each of these notions of abelian returns gives 
rise to a new characterization of Sturmian words: 

Theorem 2. A binary recurrent infinite word w is Sturmian if and only if each factor u of w 
has two or three abelian returns in w. 

Surprisingly, Sturmian words admit exactly the same characterization in terms of semi- 
abelian returns: 

Theorem 3. A binary recurrent infinite word w is Sturmian if and only if each factor u of w 
has two or three semi-abelian returns in w. 

Although the above characterizations of Sturmian words are similar to the one given in 
Theorem 1, our methods differ considerably from those used in [12], [20]. 

The paper is organized as follows: Section 2 is devoted to providing the necessary background 
and terminology relevant to the subsequent sections. In Section 3 we investigate connections 
between abelian returns and periodicity. In Section 4 we study the structure of abelian returns 
in Sturmian words. We prove that every factor of a Sturmian word has two or three abelian 
returns (Proposition 11) and moreover, a factor has two abelian returns if and only if it is 
singular (Theorem 16). In Section 5 we prove the sufficiency of the condition on the number of 
abelian returns for a word to be Sturmian (Corollary 23). In Section 6 we prove Theorem 3. 

2. Preliminaries 

2.1. Sturmian words and return words 

We begin by presenting some background on Sturmian words and return words and termi- 
nology which will be used later in the paper. 

Given a finite non-empty set S (called the alphabet), we denote by $S^*$ and $S^\omega$, respectively, 
the set of finite words and the set of (right) infinite words over the alphabet S. A word v is a 
factor (resp. a prefix, resp. a suffix) of a word w, if there exist words x, y such that w = xvy 
(resp. w = vy, resp. w = xv). The set of factors of a finite or infinite word w is denoted by 
$F(w)$. Given a finite word $u = u_1 u_2 \cdots u_n$ with $n \ge 1$ and $u_i \in S$, we denote the length n of u 
by $|u|$. The empty word will be denoted by $\varepsilon$ and we set $|\varepsilon| = 0$. For each $a \in S$, we let $|u|_a$ 
denote the number of occurrences of the letter a in u. An infinite word w is said to be k-balanced 
if and only if $||u|_a - |v|_a| \le k$ for all factors u, v of w of equal length and all letters $a \in S$. If w 
is 1-balanced, then we say that w is balanced. 

Two words u and v in $S^*$ are said to be abelian equivalent, denoted $u \sim_{ab} v$, if and only 
if $|u|_a = |v|_a$ for all $a \in S$. It is easy to see that abelian equivalence is indeed an equivalence 
relation on $S^*$. 

We say that a (finite or infinite) word w is periodic, if there exists T such that $w_{n+T} = w_n$ for 
every n. A right infinite word w is ultimately periodic if there exist T, $n_0$ such that $w_{n+T} = w_n$ 
for every $n \ge n_0$. A word w is aperiodic, if it is not (ultimately) periodic. A factor u of w is 
called right special if both ua and ub are factors of w for some pair of distinct letters $a, b \in S$. 
Similarly u is called left special if both au and bu are factors of w for some pair of distinct letters 
$a, b \in S$. The factor u is called bispecial if it is both right special and left special. 

Sturmian words can be defined in many different ways. For example, they are infinite words 
having the smallest factor complexity among aperiodic words. By a celebrated result due to 
Hedlund and Morse [15], a word is ultimately periodic if and only if its factor complexity $p(n)$ 
is uniformly bounded. In particular, $p(n) < n$ for all n sufficiently large. Sturmian words are 
exactly words whose factor complexity $p(n) = n + 1$ for all $n \ge 0$. Thus, Sturmian words are 
those aperiodic words having the lowest complexity. Since $p(1) = 2$, it follows that Sturmian 
words are binary words. In what follows, we denote the letters of a Sturmian word by 0 and 1. 

The condition p{n) = n + 1 implies the existence of exactly one right special and one left 
special factor of each length. The set of factors of a Sturmian word is closed under reversal, 
so for every length the right special factor is a reversed left special factor, and bispecial factors 
are palindromes. Bispecial factors play a crucial role in Sturmian words. Standard factors of 
a Sturmian word w are letters and factors of the form Bab, where $a \neq b \in \{0, 1\}$ and B is a 
bispecial factor of w. A factor of a Sturmian word is called singular if it is the only factor in 
its abelian class. It is well known that singular factors have the form aBa, where a is a letter 
and B a bispecial factor. We will also use the notion of Christoffel word. One of the ways to 
define Christoffel words is the following: they are factors of a Sturmian word of the form aBb 
and letters. 

In [16] it is shown that each Sturmian word may be realized measure-theoretically by an 
irrational rotation on the circle. That is, every Sturmian word is obtained by coding the symbolic 
orbit of a point x on the circle (of circumference one) under a rotation by an irrational angle $\alpha$, 
$0 < \alpha < 1$, where the circle is partitioned into two complementary intervals, one of length $\alpha$ and 
the other of length $1 - \alpha$. And conversely each such coding gives rise to a Sturmian word. The 
quantity $\alpha$ gives the frequency of letter 1 in the Sturmian word defined by such rotation. Other 
widely used characterizations are via mechanical words, cutting sequences, Sturmian morphisms 
etc., see p] for further detail. 

Let $w = w_1 w_2 \cdots$ be an infinite word. The word w is recurrent if each of its factors occurs 
infinitely many times in w. In this case, for $u \in F(w)$, let $n_1 < n_2 < \ldots$ be all integers $n_i$ such 
that $u = w_{n_i} \ldots w_{n_i+|u|-1}$. Then the word $w_{n_i} \ldots w_{n_{i+1}-1}$ is a return word (or briefly return) 
of u in w. An infinite word has k returns, if each of its factors has k returns. The following 
characterization of Sturmian words via return words was established in [20]: A word is Sturmian 
if and only if each of its factors has two returns (Theorem 1 in the Introduction). 

Also there exists a simple characterization of periodicity via return words: 

Proposition 4. [20] A recurrent infinite word is ultimately periodic if and only if there exists 
a factor having exactly one return word. 
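
The definition of return words can be illustrated on a finite prefix of the Fibonacci word. This is a sketch under stated assumptions: `return_words` and `fibonacci_word` are hypothetical helper names, and only the returns that occur between consecutive occurrences inside the chosen prefix are found.

```python
def fibonacci_word(min_length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    rules = {"0": "01", "1": "0"}
    word = "0"
    while len(word) < min_length:
        word = "".join(rules[c] for c in word)
    return word


def return_words(word, u):
    """Return words to u seen in the finite word.

    Each return word starts at an occurrence of u and ends just before
    the next occurrence of u.
    """
    starts = [i for i in range(len(word) - len(u) + 1) if word.startswith(u, i)]
    return {word[a:b] for a, b in zip(starts, starts[1:])}


fib = fibonacci_word(1000)
print(return_words(fib, "0"))
print(return_words(fib, "1"))
print(return_words(fib, "010"))
```

In line with Theorem 1, every short factor of the prefix is expected to show exactly two return words.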

2.2. Abelian and semi-abelian returns 

In this subsection we define the basic notions for the abelian case. In particular, we introduce 
two abelian versions of the notion of return word, abelian return and semi-abelian return. 

For an infinite recurrent word w and for $u \in F(w)$, let $n_1 < n_2 < n_3 < \ldots$ be all integers $n_i$ 
such that $w_{n_i} \ldots w_{n_i+|u|-1} \sim_{ab} u$. Then each $w_{n_i} \ldots w_{n_{i+1}-1}$ is called a semi-abelian return to 
the abelian class of u. By an abelian return to the abelian class of u we mean an abelian class 
of $w_{n_i} \ldots w_{n_{i+1}-1}$. So the number of abelian returns is the number of distinct abelian classes 
of semi-abelian returns. Hence for every factor u in an infinite word w the number of abelian 
returns to the abelian class of u is less than or equal to the number of semi-abelian returns to the 
abelian class of u. For brevity, in what follows we often say (semi-)abelian return to the factor 
u, meaning the abelian class of u. We will often denote an abelian return by an element of the 
abelian equivalence class, that is, by a semi-abelian return from the class. 

Example 5. Consider the Thue-Morse word 

t = 0110100110010110... 

fixed by the morphism $\mu$: $\mu(0) = 01$, $\mu(1) = 10$. The abelian class of 01 consists of two words 
01 and 10. Consider an occurrence of 01 starting at position i, i.e., $t_i = 0$, $t_{i+1} = 1$. It can be 
followed by either 0 or 10, i.e. we have either $t_{i+2} = 0$ or $t_{i+2} = 1$, $t_{i+3} = 0$. In the first case we 
have $t_{i+1}t_{i+2} = 10$, which is abelian equivalent to 01, and hence we have the semi-abelian return 
$t_i = 0$. In the second case $t_{i+1}t_{i+2} = 11$, which is not abelian equivalent to 01, so we consider the 
next factor $t_{i+2}t_{i+3} = 10 \sim_{ab} 01$, which gives the semi-abelian return $t_i t_{i+1} = 01$. Symmetrically, 
10 gives semi-abelian returns 1 and 10. So the abelian class of 01 has four semi-abelian returns: 
{0, 1, 01, 10} and three abelian returns since $01 \sim_{ab} 10$. 
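
Example 5 can be reproduced mechanically. The sketch below is an illustration only: the function names are assumptions made here, the Thue-Morse word is generated through the parity of binary digit sums rather than through the morphism, and an abelian class is represented by the sorted letters of one of its members.

```python
from collections import Counter


def thue_morse(n):
    """First n letters of the Thue-Morse word (parity of the binary digit sum)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


def semi_abelian_returns(word, u):
    """Semi-abelian returns to the abelian class of u seen in the finite word."""
    target = Counter(u)
    n = len(u)
    # Positions where a factor abelian equivalent to u begins.
    starts = [i for i in range(len(word) - n + 1)
              if Counter(word[i:i + n]) == target]
    return {word[a:b] for a, b in zip(starts, starts[1:])}


def abelian_returns(word, u):
    """Abelian classes of the semi-abelian returns, as sorted strings."""
    return {"".join(sorted(r)) for r in semi_abelian_returns(word, u)}


t = thue_morse(256)
print(semi_abelian_returns(t, "01"))
print(abelian_returns(t, "01"))
```

Since the result depends only on the abelian class, calling the functions with "10" instead of "01" gives the same sets.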

For our considerations we will use the following definitions. We say that a letter a is isolated 
in a word $w \in S^\omega$, if aa is not a factor of w. A letter $a \in S$ appears in w in a block of length 
$k > 0$, if a word $b a^k c$ is a factor of w for some letters $b \neq a$, $c \neq a$. 
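
For a finite prefix these two definitions translate directly into code. The following is a small sketch with hypothetical helper names; blocks touching either end of the prefix are skipped because their length in the infinite word is not known.

```python
import re


def is_isolated(word, a):
    """The letter a is isolated if aa does not occur in the word."""
    return a + a not in word


def block_lengths(word, a):
    """Lengths k such that b a^k c occurs with letters b != a and c != a."""
    pattern = re.escape(a) + "+"
    return {m.end() - m.start()
            for m in re.finditer(pattern, word)
            # Keep only blocks with a letter on both sides.
            if m.start() > 0 and m.end() < len(word)}


print(block_lengths("0110100110010110", "1"))
print(is_isolated("0100101001", "1"))
```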

In this paper we establish a new characterization of Sturmian words analogous to Theorem 1. 
Namely, we prove that a recurrent infinite word is Sturmian if and only if each of its factors has 
two or three abelian returns (see Theorem 2 in the Introduction). On the other hand, contrary to 
the property of being Sturmian, abelian returns do not give a simple characterization of periodicity 
analogous to Proposition 4. In terms of semi-abelian returns Sturmian words have exactly the 
same characterization as in terms of abelian returns (see Theorem 3 in the Introduction). 

3. Abelian returns and periodicity 



In this section we discuss relations between periodicity and numbers of abelian and semi- 
abelian returns. We begin by proving a simple sufficient condition for periodicity: 

Lemma 6. Let |S| = k. If each factor of a recurrent infinite word over the alphabet S has at 
most k abelian returns, then the word is periodic. 

Proof. Let w be a recurrent word over a k-letter alphabet, and let v be a factor of w containing 
all letters from the alphabet. Consider two occurrences of v in w, say in positions m and n (with 
m < n). Then the abelian class of $w_m \cdots w_{n-1}$ has all letters as abelian returns, and hence no 
more, because every factor of w must have at most k abelian returns. Thus w is periodic with 
period $n - m$. □ 

Remark. Actually, this proves something stronger: Let w be any aperiodic word over an 
alphabet S, |S| = k, and let u be any factor of w containing k distinct letters, and let vu be 
any factor of w distinct from u beginning in u. Then the abelian class of v must have at least k 
abelian returns. It follows that if a word is not periodic, then for every positive integer N there 
exists an abelian factor of length $\ge N$ having at least k + 1 abelian returns. In other words, the 
value k + 1 must be assumed infinitely often. 

Remark. Notice that the condition given by Lemma 6 is not necessary for periodicity. It is 
not difficult to construct a periodic word such that some of its factors have more than k abelian 
returns. 

Notice also that a characterization of periodicity similar to Proposition 4 in terms of abelian 
returns does not exist. Moreover, in the case of abelian returns the proposition fails in both directions. 
Consider an infinite aperiodic word of the form $\{110010, 110100\}^\omega$. It is easy to see that the 
factor 11 has one abelian return $110010 \sim_{ab} 110100$. So, the existence of a factor having one 
abelian return does not guarantee periodicity. The converse is not true either: there exist 
periodic words such that each factor has at least two abelian returns. An example is given by 
the following word with period 24: 

$w = (001101001011001100110011)^\omega$. (1) 

To check that every factor of this word has at least two abelian returns, one can check the 
factors up to the length 12. If we denote the period of w by u, then every factor v of length l, 
12 < l < 24, has the same abelian returns as the abelian class of words of length 24 - l obtained from 
u by deleting v. For a factor of length longer than 24, its abelian returns coincide with the abelian 
returns of the part of this factor obtained by shortening it by u. 
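
The first example of this discussion, a word over the blocks 110010 and 110100 in which the factor 11 has a single abelian return, can be checked on a finite prefix. The sketch below makes one assumption that is not in the text: the order of the two blocks follows the Thue-Morse word, which is just one convenient aperiodic choice. The helper names are illustrative.

```python
from collections import Counter


def thue_morse(n):
    """First n letters of the Thue-Morse word (parity of the binary digit sum)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


def semi_abelian_returns(word, u):
    """Semi-abelian returns to the abelian class of u seen in the finite word."""
    target = Counter(u)
    n = len(u)
    starts = [i for i in range(len(word) - n + 1)
              if Counter(word[i:i + n]) == target]
    return {word[a:b] for a, b in zip(starts, starts[1:])}


blocks = ("110010", "110100")
# Assumption: block order driven by the Thue-Morse word (any aperiodic order would do).
word = "".join(blocks[int(c)] for c in thue_morse(200))

returns = semi_abelian_returns(word, "11")
classes = {"".join(sorted(r)) for r in returns}
print(returns)
print(len(classes))
```

The two semi-abelian returns are the two blocks themselves, and they fall into a single abelian class.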

Now we continue with relations between semi-abelian returns and periodicity. In this connection 
semi-abelian returns show intermediate properties between normal and abelian returns. 
E.g., normal returns admit the characterization of periodicity given by Proposition 4; for abelian 
returns the proposition fails in both directions; and in the case of semi-abelian returns 
the proposition holds in one direction, giving a sufficient condition for periodicity: 

Proposition 7. If a recurrent infinite word has a factor with one semi-abelian return, then the 
word is periodic. 

Proof. It is readily verified that this unique semi-abelian return word gives the period. □ 

We note that this condition is not necessary for periodicity. One can take the same example 
(1) of a periodic word as for abelian returns. Since each of its factors has at least two abelian 
returns, it has at least two semi-abelian returns. 

Lemma 6 holds also for semi-abelian returns (exactly the same proof works): 

Lemma 8. Let |S| = k. If each factor of a recurrent infinite word over the alphabet S has at 
most k semi-abelian returns, then the word is periodic. 

4. The structure of abelian returns of Sturmian words 

In this section we prove the "only if" part of Theorem 2, and in addition we establish some 
properties concerning the structure of abelian returns of Sturmian words. 

The following proposition follows directly from definitions and basic properties of Sturmian 
words: 

Proposition 9. Semi-abelian returns of factors of a Sturmian word are Christoffel words. 

Proof. Consider a semi-abelian return to a factor v of length n starting at position i of a Sturmian 
word w. We should prove that this semi-abelian return is either a letter or of the form aBb, 
where $a \neq b$ are letters and B is a bispecial factor of w. If $w_i = w_{i+n}$, then the letter $w_i$ is the 
semi-abelian return. If $w_i = a$, $w_{i+n} = b$, $a \neq b$, then there exists $k \ge 0$ such that 
$w_{i+1} \ldots w_{i+k} = w_{i+1+n} \ldots w_{i+k+n}$, and $w_{i+k+1} \neq w_{i+k+1+n}$. Since w is balanced, we have that 
$w_{i+k+1} = b$, $w_{i+k+1+n} = a$. So, $w_{i+k+2} \ldots w_{i+k+n+1} \sim_{ab} v$, and 
$w_i \ldots w_{i+k+1} \sim_{ab} w_{i+n} \ldots w_{i+k+n+1}$ is a semi-abelian return to v. By definition the factor 
$w_{i+1} \ldots w_{i+k} = w_{i+1+n} \ldots w_{i+k+n}$ is bispecial. □ 

Corollary 10. Fix $l \ge 2$. Then each factor u of a Sturmian word has at most one abelian return 
of length l. 

Now we proceed to the "only if" part of Theorem 2: 
Proposition 11. Each factor of a Sturmian word has two or three abelian returns. 

The proof of this proposition is based on the characterization of balanced words presented 
in [11]. We will need some notation from the paper. 

Suppose $1 \le p < q$ are positive integers such that $\gcd(p, q) = 1$. Let $\mathcal{W}_{p,q}$ denote the set of 
all words $w \in \{0, 1\}^q$ with $|w|_1 = p$. If $w \in \mathcal{W}_{p,q}$ then the symbol 1 occurs with frequency p/q in 
w. Define the shift $\sigma : \{0, 1\}^\omega \to \{0, 1\}^\omega$ by $\sigma(w)_i = w_{i+1}$. Similarly define $\sigma : \{0, 1\}^q \to \{0, 1\}^q$ 
by $\sigma(w_0 \ldots w_{q-1}) = w_1 \ldots w_{q-1} w_0$. 

Since $\gcd(p, q) = 1$, it follows that any element of $\mathcal{W}_{p,q}$ has the least period q under the shift 
map $\sigma$. We will write $w \sim w'$ if there exists $0 \le k \le q - 1$ such that $w' = \sigma^k(w)$. In this case 
we say that w, w' are cyclically conjugate, or that w, w' are cyclic shifts of one another. The 
equivalence class $\{\sigma^i(w) : 0 \le i < q\}$ of each $w \in \mathcal{W}_{p,q}$ contains exactly q elements. Let 

$W_{p,q} = \mathcal{W}_{p,q} / \sim$ 

denote the corresponding quotient. Elements of $W_{p,q}$ are called orbits. It will usually be convenient 
to denote an equivalence class in $W_{p,q}$ by one of its elements w. 
Given an orbit $[w] \in W_{p,q}$, let 

$w_{(0)} <_L w_{(1)} <_L \cdots <_L w_{(q-1)}$ 

denote the lexicographic ordering of its elements. Define the lexicographic array A[w] of the orbit 
[w] to be the $q \times q$ matrix whose ith row is $w_{(i)}$. We will index this array by $0 \le i, j \le q - 1$, 
so that $A[w] = (A[w]_{ij})_{i,j=0}^{q-1}$. For $0 \le i, j \le q - 1$, let $w_{(i)}[j]$ denote the length-(j + 1) prefix of 
$w_{(i)}$, so the $w_{(i)}[j]$ are the length-(j + 1) factors of w, counted with multiplicity. For each j this 
induces the following lexicographic ordering: 

$w_{(0)}[j] \le_L w_{(1)}[j] \le_L \cdots \le_L w_{(q-1)}[j]$. 

Theorem 12. [11] Suppose $w \in \{0,1\}^q$. The following are equivalent: 

(1) w is a balanced word, 

(2) $|w_{(i)}[j]|_1 \le |w_{(i+1)}[j]|_1$ for all $0 \le i \le q - 2$ and $0 \le j \le q - 1$. 

The following proposition from [11] gives a very practical way of writing down the lexico- 
graphic array associated to a balanced word. 

Proposition 13. [11] Let [w] be the unique balanced orbit in $W_{p,q}$. Define $u \in \mathcal{W}_{p,q}$ by 

$u = 0 \ldots 0 \underbrace{1 \ldots 1}_{p}$. 

Then, for $0 \le i, j \le q - 1$, 

(1) $A[w]_{ij} = (\sigma^{jp} u)_i$, 

(2) The jth column of A[w] is (the vector transpose of) the word $\sigma^{jp} u$, 

(3) $w_{(i)} = u_i (\sigma^{p} u)_i (\sigma^{2p} u)_i \ldots (\sigma^{(q-1)p} u)_i$. 

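Example 14. Consider a balanced word $w = 0101001 \in \mathcal{W}_{p,q}$. The lexicographic ordering of 
[w] is 

$0010101 <_L 0100101 <_L 0101001 <_L 0101010 <_L 1001010 <_L 1010010 <_L 1010100$, 
so the corresponding lexicographic array is 

\[
A[w] = \begin{pmatrix}
0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 0
\end{pmatrix}
\]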
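
The lexicographic array of this example can also be generated by a short program, once by sorting the cyclic conjugates and once through the formula of Proposition 13. This is a sketch only: the function names are chosen here for illustration, and rows are stored as strings indexed from 0.

```python
def conjugates(w):
    """All cyclic shifts of the finite word w."""
    return [w[k:] + w[:k] for k in range(len(w))]


def lexicographic_array(w):
    """Rows are the cyclic conjugates of w in lexicographic order."""
    return sorted(conjugates(w))


def array_from_proposition_13(p, q):
    """Entry (i, j) is the i-th letter of sigma^{jp}(u), where u = 0^{q-p} 1^p."""
    u = "0" * (q - p) + "1" * p
    return ["".join(u[(i + j * p) % q] for j in range(q)) for i in range(q)]


A = lexicographic_array("0101001")
for row in A:
    print(row)
print(A == array_from_proposition_13(3, 7))
```

Here p = 3 is the number of occurrences of 1 in 0101001 and q = 7 is its length.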

We now apply the above technique for studying abelian returns as follows: 
Fix a Sturmian word s and a factor v. First notice that v cannot have only one abelian 
return, otherwise we immediately get a contradiction with the irrationality of letter frequencies 
in s. We consider a standard factor w of s of long enough length to contain v and all abelian 
returns to v. Let $|w| = q$, $|w|_1 = p$. Then all the conjugates of w are factors of s, they are 
pairwise distinct, and $\gcd(p, q) = 1$ (see, e.g., [11]). Without loss of generality we can assume 
that v is "poor" in 1-s, i.e., it contains fewer 1's than the unique other abelian class of the 
same length. Then if we consider in A[w] the words $w_{(i)}[j]$, we have that there exists $n < q - 1$ 
such that $w_{(i)}[j] \sim_{ab} v$ for $0 \le i \le n$, and $w_{(i)}[j] \not\sim_{ab} v$ for $n < i \le q - 1$. Note also that 
$A[w]_{im} = A[w]_{(i+q-p)(m+1)}$; from now on the indices are taken modulo q. 

The lexicographic array allows us to find abelian returns to v as follows: For a word u denote 
by u[m, l] the factor $u_m \cdots u_l$. If for an i, $0 \le i \le n$, we have $w_{(i)}[k, k + j] \sim_{ab} v$, where v is 
as above and k > 0 is the minimal such length, then by definition $w_{(i)}[k - 1]$ is a semi-abelian 
return to v. Notice also that if $A[w]_{(i-1)k} = 1$ and $A[w]_{ik} = 0$, then $w_{(m)}[k, k + j] \sim_{ab} v$ for 
$m = i, \ldots, i + n$. That is, we have exactly n + 1 words from the abelian class of v starting 
in every column, and these words are in consecutive n + 1 rows (the first and the last row are 
considered as consecutive). 
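
Both observations, the shift identity between neighbouring columns and the fact that each column carries a cyclically consecutive run of rows from the abelian class of v, can be checked on the array of the word 0101001. The sketch below takes v = 001 as a sample factor; the helper names are hypothetical, and the test for abelian equivalence counts 1's, which is enough over a binary alphabet.

```python
def lexicographic_array(w):
    """Rows are the cyclic conjugates of w in lexicographic order."""
    return sorted(w[k:] + w[:k] for k in range(len(w)))


def shift_identity_holds(A, p):
    """Check A[i][m] == A[(i + q - p) mod q][(m + 1) mod q] for all i, m."""
    q = len(A)
    return all(A[i][m] == A[(i + q - p) % q][(m + 1) % q]
               for i in range(q) for m in range(q))


def rows_in_class(A, v, k):
    """Rows whose cyclic factor of length |v| starting in column k is
    abelian equivalent to v (binary alphabet: compare the number of 1's)."""
    q = len(A)
    ones = v.count("1")
    return [i for i in range(q)
            if sum(A[i][(k + t) % q] == "1" for t in range(len(v))) == ones]


A = lexicographic_array("0101001")
print(shift_identity_holds(A, 3))
for k in range(3):
    print(k, rows_in_class(A, "001", k))
```

For this array n = 4, so each column should carry n + 1 = 5 rows from the class of 001, consecutive when the first and the last row are treated as neighbours.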



Example 15. Consider abelian returns to the abelian class of 001 in Example 14: $w_{(i)}[2] \sim_{ab} 001$ 
for $0 \le i \le 4$; $w_{(i)}[1,3] \sim_{ab} 001$ for i = 4, 5, 6, 0, 1; $w_{(i)}[2,4] \sim_{ab} 001$ for i = 1, ..., 5. So, the 
abelian returns are $w_{(0)}[0] = w_{(1)}[0] = 0$, $w_{(4)}[0] = 1$, $w_{(2)}[1] = w_{(3)}[1] = 01$. 
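
The procedure used in Example 15 can be written out as follows. This is only a sketch of the idea under the conventions above (rows as strings, columns taken modulo q, abelian equivalence tested by counting 1's); `returns_from_array` is a hypothetical name.

```python
def lexicographic_array(w):
    """Rows are the cyclic conjugates of w in lexicographic order."""
    return sorted(w[k:] + w[:k] for k in range(len(w)))


def returns_from_array(A, v):
    """For every row starting with a word from the abelian class of v,
    find the smallest k > 0 such that the factor starting in column k
    is again in the class; the prefix of length k is the return."""
    q = len(A)
    n = len(v)
    ones = v.count("1")

    def in_class(row, k):
        return sum(row[(k + t) % q] == "1" for t in range(n)) == ones

    result = {}
    for i, row in enumerate(A):
        if in_class(row, 0):
            # Column q coincides with column 0, so the search always stops.
            k = next(k for k in range(1, q + 1) if in_class(row, k))
            result[i] = row[:k]
    return result


A = lexicographic_array("0101001")
print(returns_from_array(A, "001"))
```

The dictionary maps each row index to the semi-abelian return read off that row, which matches the list given in the example.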



Proof of Proposition 11. Suppose that some factor v of length j + 1 has at least 4 abelian returns. 
Without loss of generality we may assume that v is poor in 1, and in the lexicographic array, 
rows $0, \ldots, n$ start with factors from the abelian class of v. By Corollary 10 there can be at most 
one abelian return of a fixed length greater than 1 (length 1 will be considered separately), so 
in a lexicographic array we must have one of the following situations: 

1) there exist $k_1 < k_2$ and $n_1 < n_2 < n$ such that $w_{(i)}[j]$ has semi-abelian returns of length $k_1$ for 
$i = 1, \ldots, n_1$, $w_{(i)}[j]$ has semi-abelian returns of length $k_2$ for $i = n_1 + 1, \ldots, n_2$, and $w_{(n_2+1)}[j]$ has 
semi-abelian returns of length greater than $k_2$; 

2) symmetric case: there exist $k_1 < k_2$ and $n_1 < n_2 < n$ such that $w_{(i)}[j]$ has semi-abelian returns 
of length $k_2$ for $i = n_1 + 1, \ldots, n_2$, $w_{(i)}[j]$ has semi-abelian returns of length $k_1$ for $i = n_2 + 1, \ldots, n$, 
and $w_{(n_1)}[j]$ has semi-abelian returns of length greater than $k_2$. 

We consider only case 1) as the proof of case 2) is similar. First, in case 1) one can notice 
that the words $w_{(n_1)}[k_1, k_1 + q]$ and $w_{(n_2)}[k_2, k_2 + q]$ coincide. So if we consider semi-abelian returns 
"to the left" of the words $w_{(n_1)}[k_1, k_1 + j]$ and $w_{(n_2)}[k_2, k_2 + j]$, they should be the same, but they 
are not: the first one is of length $k_1$, the second one is of length $k_2$. 

It remains to consider the case when v has both letters as abelian returns. It can be seen 
directly from the lexicographic array, that the third and the last return is 01 (in this case after 
a word not from abelian class of v we will necessarily have a word from abelian class of v, i.e., 
the longest possible length of abelian return is 2). □ 

Theorem 16. A factor of a Sturmian word has two abelian returns if and only if it is singular. 
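
Proposition 11 and Theorem 16 can be tested experimentally on a prefix of the Fibonacci word. The following is a sketch under stated assumptions: the helper names are illustrative, an abelian class is represented by the sorted letters of one of its members, and only factors up to length 8 are examined, so the check is evidence on a finite sample rather than a proof.

```python
from collections import Counter


def fibonacci_word(min_length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    rules = {"0": "01", "1": "0"}
    word = "0"
    while len(word) < min_length:
        word = "".join(rules[c] for c in word)
    return word


def factors(word, n):
    """Distinct factors of length n of the finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def abelian_returns(word, u):
    """Abelian returns to the abelian class of u seen in the finite word."""
    target = Counter(u)
    n = len(u)
    starts = [i for i in range(len(word) - n + 1)
              if Counter(word[i:i + n]) == target]
    return {"".join(sorted(word[a:b])) for a, b in zip(starts, starts[1:])}


def is_singular(word, u):
    """u is the only factor of its length in its abelian class."""
    return all(f == u or Counter(f) != Counter(u)
               for f in factors(word, len(u)))


fib = fibonacci_word(1000)
for n in range(1, 5):
    for u in sorted(factors(fib, n)):
        print(u, len(abelian_returns(fib, u)), is_singular(fib, u))
```

Each printed line shows a factor, the number of abelian returns found for it, and whether it is singular.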

Proof. The method of proof is similar to the proof of Proposition 11 and relies upon the 
characterization of balanced words from [11]. 

If a factor is singular, then it is the only word in its abelian class, so its semi-abelian returns 
coincide with usual returns. Since every factor of a Sturmian word has two returns [20], a 
singular factor has two semi-abelian returns, and hence two abelian returns. 

Now we will prove the converse, i.e., that if a factor v of a Sturmian word s of length j + 1 
has two abelian returns, then it is singular. 



As in the proof of Proposition 11, we consider a standard factor w of s of long enough length 
to contain v and all abelian returns to v, and denote $|w| = q$, $|w|_1 = p$. Without loss of generality 
we again assume that v is "poor" in 1-s, so that there exists $n < q - 1$ such that $w_{(i)}[j] \sim_{ab} v$ 
for $0 \le i \le n$, and $w_{(i)}[j] \not\sim_{ab} v$ for $n < i \le q - 1$. 

It is not difficult to see that two abelian returns are possible in one of the following cases: 

Case 1) there exist $0 \le m < n$, $0 < k_1, k_2 < q$ such that $w_{(i)}[k_1 - 1]$ is a semi-abelian return for 
all $0 \le i \le m$, $w_{(i)}[k_2 - 1]$ is a semi-abelian return for all $m + 1 \le i \le n$; 

Case 2) there exist $0 \le m_1 < m_2 < n$, $0 < k_1 < k_2 < q$ such that $w_{(i)}[k_1 - 1]$ is a semi-abelian 
return for all $0 \le i \le m_1$ and $m_2 + 1 \le i \le n$; $w_{(i)}[k_2 - 1]$ is a semi-abelian return for all 
$m_1 + 1 \le i \le m_2$. 

Case 1) In case 1) we will assume that $k_1 < k_2$, the proof in case $k_2 < k_1$ is symmetric. We 
will consider two subcases: 

Case 1a) $A[w]_{mk_2} = 1$, $A[w]_{(m+1)k_2} = 0$. This means that $w_{(i)}[k_2, k_2 + j] \sim_{ab} v$ for 
$i = m + 1, \ldots, m + n + 1$, and $A[w]_{m(k_2-1)} = 0$, $A[w]_{(m+1)(k_2-1)} = 1$. So, the element $A[w]_{(m+1)k_2}$ 
is a left-upper element of a block of the abelian class of v, and $A[w]_{m(k_2-1)}$ is a right-lower element 
of another such block. It is easy to see that the latter block starts in column $k_1$. Therefore, 
$|v| = j + 1 = k_2 - k_1 < k_2$. 

In case 1a) we will prove that the abelian class of v consists of a single word, i.e., $w_{(i)}[j] = v$ 
for $i = 0, \ldots, n$. Suppose that $w_{(i)}[j] \neq w_{(i+1)}[j]$ for some $i \in \{0, \ldots, n - 1\}$. Since the rows 
grow lexicographically, it means that there exists $0 \le l \le j < k_2 - 1$ such that $A[w]_{il} = 0$, 
$A[w]_{(i+1)l} = 1$. Hence $A[w]_{i(l+1)} = 1$, $A[w]_{(i+1)(l+1)} = 0$, and so $w_{(i+1)}[l + 1, l + 1 + j] \sim_{ab} v$. If 
$m < i + 1 \le n$, then the word $w_{(i+1)}[j]$ has return $w_{(i+1)}[l]$, which is impossible, because it has 
return $w_{(i+1)}[k_2 - 1]$. Similarly we get that the case $0 < i + 1 \le m$ and $l + 1 < k_1$ is impossible. 

In case $0 < i + 1 \le m$ and $k_1 \le l + 1 < k_2$ we get that the word $w_{(i+1)}[k_1, k_1 + j]$ has return 
$w_{(i+1)}[k_1, l]$ of length $l - k_1 + 1$. But in this case $w_{(t)}[l + 1, l + 1 + j] \sim_{ab} v$ for $t = i + 1, \ldots, i + 1 + n$. 
Contradiction with the condition that $w_{(n)}[k_2 - 1]$ is a semi-abelian return to $w_{(n)}[j]$. So, the case 
$0 < i + 1 \le m$ and $k_1 \le l + 1 < k_2$ is impossible. Hence $w_{(i)}[j] = w_{(i+1)}[j]$ for $i = 0, \ldots, n - 1$, 
i.e., the abelian class of v consists of a single word. 

Case 1b) $A[w]_{mk_2} = 0$ or $A[w]_{(m+1)k_2} = 1$. This means that $w_{(m)}[k_2, k_2 + j] \sim_{ab} v$. Hence 
the word $w_{(n)}[j]$ has semi-abelian return $w_{(n)}[k_2]$ of length $k_2 + 1$, and the word $w_{(m)}[k_1, k_1 + j]$ 
has semi-abelian return $w_{(m)}[k_1, k_2]$ of length $k_2 - k_1 + 1$, so the returns are different. This is 
impossible since $w_{(n)} = w_{(m)}[k_1, k_1 + q - 1]$. 

Case 2) In case 2) the fact that $w_{(i)}[k_1 - 1]$ is a semi-abelian return for all $0 \le i \le m_1$ and 
$m_2 + 1 \le i \le n$ implies that $n > q/2$. So, $k_1 = 1$, i.e., we necessarily have return(s) of 
length 1. Since there are two abelian returns in total, we can have only one return of length 
1, and this return is 0. It means that $A[w]_{i0} = 0$ for $0 \le i \le n$. Since $w_{(m_2)}[1, j + 1] \not\sim_{ab} v$ 
and $w_{(m_2+1)}[1, j + 1] \sim_{ab} v$, we have $A[w]_{m_2 1} = 1$, $A[w]_{(m_2+1)1} = 0$, and hence $A[w]_{m_2 0} = 0$, 
$A[w]_{(m_2+1)0} = 1$. We get a contradiction with $A[w]_{i0} = 0$ for $0 \le i \le n$. 

So, the converse is proved, i.e., every factor of a Sturmian word having two abelian returns 
is singular. □ 

5. Proof of Theorem 2: the sufficiency 

Here we prove the "if" part of Theorem 2, i.e., we establish the condition on the number of 
abelian returns forcing a word to be Sturmian: we prove that a binary recurrent word with 
each factor having two or three abelian returns is Sturmian. 

Proposition 17. If each factor of a binary recurrent infinite word has at most three abelian 
returns and at least two semi-abelian returns, then the word is balanced. 

Notice that we formulate and prove auxiliary lemmas and propositions in a slightly stronger form 
than we need for sufficiency in Theorem 2: instead of the condition "each factor has two or three 
abelian returns" we use the weaker condition "each factor has at most three abelian and at least 
two semi-abelian returns". Using this condition we will be able to prove the sufficiency in both 
Theorems 2 and 3: since both words with two or three abelian returns and words with two or 
three semi-abelian returns satisfy this condition, we solve two problems at once. 

The proof of this proposition is rather technical. It is based on considering abelian returns to 
different possible factors of the infinite word and successively restricting the possible form of 
the word. Denote the binary word with at most three abelian returns by $w \in \{0, 1\}^\omega$. In the rest 
of this section instead of abelian returns "to the left" we consider abelian returns "to the right": 
if vu is a factor having $v' \sim_{ab} v$ as its suffix, and vu does not contain as factors other words 
abelian equivalent to v besides the suffix and the prefix, then the abelian class of u is an abelian return to 
the abelian class of v. It is easy to see that regardless of the definition, the set of abelian returns 
to each abelian factor is the same. We will refer to the word u as a right semi-abelian return of the 
abelian class of v, so normal semi-abelian returns can be regarded as left semi-abelian returns. 
Right semi-abelian returns do not necessarily coincide with left semi-abelian returns, but their 
abelian classes also give the set of abelian returns. Though this does not make any essential 
difference, this modification of the definition is more convenient for our proof of this proposition. 

We will make use of the following key lemma: 

Lemma 18. If each factor of a binary recurrent infinite word w has at most three abelian and 
at least two semi-abelian returns, then one of the letters is isolated. 

Proof. Considering abelian returns to letters, we get that every letter can appear in blocks of at most three different lengths. Denote these lengths for blocks of 0's by $l_1, l_2, l_3$, where $l_1 < l_2 < l_3$, and for blocks of 1's by $j_1, j_2, j_3$, where $j_1 < j_2 < j_3$. Notice that a letter can appear in blocks of only one or two lengths; then the third length, or the third and the second lengths, are missing.

Consider right semi-abelian returns of the word $10^{l_1}$: they are 1, $0^{l-l_1}1$ for $l = l_2, l_3$ (if 0 appears in blocks of the corresponding lengths), $1^{j-1}0^{l_1}$ for $j = j_1 > 1, j_2, j_3$ (if 1 appears in blocks of the corresponding lengths) and 0 for $j_1 = 1$. Some of these returns should be missing or abelian equivalent to others in order to have at most three abelian returns in total. So we have the following cases:

- $j_2, j_3, l_3$ are missing, i.e., $w \in \{0^{l_1}1^{j_1}, 0^{l_2}1^{j_1}\}^\omega$. In this case abelian returns are 1, $0^{l_2-l_1}1$, and $1^{j_1-1}0^{l_1}$ for $j_1 > 1$ or 0 for $j_1 = 1$.

- $l_2, l_3, j_3$ are missing, i.e., $w \in \{0^{l_1}1^{j_1}, 0^{l_1}1^{j_2}\}^\omega$. Abelian returns are 1, $1^{j_2-1}0^{l_1}$, and $1^{j_1-1}0^{l_1}$ if $j_1 > 1$, or 0 if $j_1 = 1$.

- $j_2, j_3$ are missing, $j_1 = 2$, $l_2 = 2l_1$ or $l_3 = 2l_1$, i.e., $w \in (\{0^{l_1}, 0^{l_2}, 0^{l_3}\}1^2)^\omega$. Abelian returns are 1, $0^{l_2-l_1}1$, $0^{l_3-l_1}1$.

- $l_3, j_3$ are missing, $l_2 = 2l_1$, $j_1 = 2$ or $j_2 = 2$, $w \in (\{0^{l_1}, 0^{2l_1}\}\{1^2, 1^j\})^\omega$. Abelian returns are 1, $0^{l_1}1$, $1^{j-1}0^{l_1}$ (if $j > 1$) or 0 (if $j = 1$).

- $j_2, l_2, j_3, l_3$ are missing; then $w = (0^{l_1}1^{j_1})^\omega$ is periodic. This case is impossible since $0^{l_1}$ has only one semi-abelian return.

Notice that the first two cases are symmetric. Considering abelian returns to the word $1^{j_1}0$, we get symmetric cases (0 changes places with 1, $j_k$ changes places with $l_k$, $k = 1, 2, 3$). Combining the cases obtained by considering abelian returns to $10^{l_1}$ with the cases obtained by considering abelian returns to $1^{j_1}0$, we finally get the following remaining cases (up to renaming letters):

1) $j_2, j_3, l_3$ are missing, i.e. $w$ is of the form $w \in \{0^{l_1}1^{j_1}, 0^{l_2}1^{j_1}\}^\omega$.

2) $l_3, j_3$ are missing, $l_1 = 1$, $l_2 = 2$, $j_1 = 2$, $j_2 = 4$, i.e. $w \in (\{0, 0^2\}\{1^2, 1^4\})^\omega$.

3) $l_3, j_3$ are missing, $l_1 = 1$, $l_2 = 2$, $j_1 = 1$, $j_2 = 2$, i.e. $w \in (\{0, 0^2\}\{1, 1^2\})^\omega$.

4) $l_3, j_3$ are missing, $l_1 = 2$, $l_2 = 4$, $j_1 = 2$, $j_2 = 4$, i.e. $w \in (\{0^2, 0^4\}\{1^2, 1^4\})^\omega$.

Case 1): $w \in \{0^{l_1}1^j, 0^{l_2}1^j\}^\omega$.

In the first case we should prove that $j_1 = 1$. We omit the index 1 for brevity: $j = j_1$. Suppose that $j > 1$. Consider right abelian returns to the word $10^{l_2}$. They are 1 and $1^{j-1}(0^{l_1}1^j)^k0^{l_2}$ for all $k \ge 0$ such that the word $0^{l_2}1^j(0^{l_1}1^j)^k0^{l_2}$ is a factor of $w$. Therefore, we have at most two values of $k$ (possibly including 0).

Right abelian returns to the word $1^j0^{l_1}1$ are 1 and $(0^{l_2}1^j)^m0^{l_1}1$ for all $m \ge 0$ such that the word $10^{l_1}1^j(0^{l_2}1^j)^m0^{l_1}1$ is a factor of $w$. So, we have at most two values of $m$ (possibly including 0).

Notice that we cannot have only one value of $k$ and only one value of $m$ simultaneously, since in this case we have a periodic word $w = ((0^{l_1}1^j)^{k_1}(0^{l_2}1^j)^{m_1})^\omega$, and the word $(0^{l_2}1^j)^{m_1-1}0^{l_2}$ has only one semi-abelian return.

Taking into account the conditions on $m$ and $k$, which we have just obtained from considering abelian returns to both $10^{l_2}$ and $1^j0^{l_1}1$, we find that there are two possibilities:

Case 1a) $w \in (\{(0^{l_1}1^j)^{k_1}, (0^{l_1}1^j)^{k_2}\}0^{l_2}1^j)^\omega$, $0 < k_1 < k_2$. The word $0^{l_2}1^j0^{l_1}1^{j-1}$ has returns 1, $0^{l_1}1$, and $0^{l_2}(1^j0^{l_1})^{k-1}1$ for all $k$ such that the word $0^{l_2}1^j(0^{l_1}1^j)^k0^{l_2}$ is a factor of $w$. To provide at most three abelian returns, $w$ should admit only one value of $k$. In this case there is also only one value of $m$, so the case 1a) is impossible.

Case 1b) $w \in (0^{l_1}1^j\{(0^{l_2}1^j)^{m_1}, (0^{l_2}1^j)^{m_2}\})^\omega$, $0 < m_1 < m_2$. The word $1^{j-1}0^{l_1}1^j0^{l_2}$ has returns 1, $10^{l_2}$, and $10^{l_1}(1^j0^{l_2})^{m-1}$ for all $m$ such that the word $10^{l_1}1^j(0^{l_2}1^j)^m0^{l_1}1$ is a factor of $w$. To provide at most three abelian returns, $w$ should admit only one value of $m$. In this case there is also only one value of $k$, so the case 1b) is impossible.

Thus, in case 1) 1's are isolated.

In cases 2)-4) we need to consider words containing all four blocks, otherwise we fall under the conditions of case 1), in which we proved that 1's are isolated. The proof is similar for the three cases, and is based on studying abelian returns of a certain type. When we examine $w \in (\{0^{l_1}, 0^{l_2}\}\{1^{j_1}, 1^{j_2}\})^\omega$, we consider abelian returns to the words $0^{l_1}1^{j_2}$ and $0^{l_2}1^{j_1}$, and with a technical case study obtain that if both words have at most three abelian returns, then $w$ is periodic of a certain form, and then find a factor of $w$ having one semi-abelian return.

Case 2): $w \in (\{0^2, 0^4\}\{1, 1^2\})^\omega$.

Consider abelian returns of the word $0^21^2$. Factors of $w$ from the abelian class of $0^21^2$ are the following: $0^21^2$, $1^20^2$, 0110, 1001. Notice that each of these words is necessarily a factor of $w$. Consider right semi-abelian returns to each factor:

• $0^21^2$, $01^20$ have right semi-abelian return 0

• $1^20^2$ has right semi-abelian returns of the form $\alpha_1 = (0^210^2)^{i_1}1$ and/or $\alpha_2 = (0^210^2)^{i_2}0^21^2$ for some $i_1, i_2 \ge 0$

• $10^21$ has right semi-abelian returns of the form $\alpha_3 = (0^41)^{i_3}1$ and/or $\alpha_4 = (0^41)^{i_4}0^21$ for some $i_3, i_4 \ge 0$

We will also use abelian returns of the word $0^41$:

• $0^41$ could have right semi-abelian return 0 and returns of the forms $\alpha'_1 = (10^21)^{j_1}0^2$ with $j_1 > 0$ and $\alpha'_2 = (10^21)^{j_2}10^4$ for some $j_2 \ge 0$

• $0^310$, $010^3$ (not necessarily factors of $w$) have right semi-abelian return 0

• $0^210^2$ could have right semi-abelian return 0 and returns of the forms $\alpha'_3 = (1^20^2)^{j_3}0^2$ with $j_3 > 0$ and $\alpha'_4 = (1^20^2)^{j_4}10^2$ for some $j_4 \ge 0$

• $10^4$ has right semi-abelian return 1.



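These are summarized in the table below:

abelian class | word | possible right semi-abelian returns
$0^21^2$ | $0^21^2$, $01^20$ | 0
 | $1^20^2$ | $\alpha_1 = (0^210^2)^{i_1}1$, $\alpha_2 = (0^210^2)^{i_2}0^21^2$ for some $i_1, i_2 \ge 0$
 | $10^21$ | $\alpha_3 = (0^41)^{i_3}1$, $\alpha_4 = (0^41)^{i_4}0^21$ for some $i_3, i_4 \ge 0$
$0^41$ | $0^41$ | 0, $\alpha'_1 = (10^21)^{j_1}0^2$ with $j_1 > 0$, $\alpha'_2 = (10^21)^{j_2}10^4$ for some $j_2 \ge 0$
 | $0^310$, $010^3$ | 0
 | $0^210^2$ | 0, $\alpha'_3 = (1^20^2)^{j_3}0^2$ with $j_3 > 0$, $\alpha'_4 = (1^20^2)^{j_4}10^2$ for some $j_4 \ge 0$
 | $10^4$ | 1
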

Notice that $\alpha_1 \sim_{ab} \alpha_3$ when $i_1 = i_3$, and $\alpha'_1 \sim_{ab} \alpha'_3$ when $j_1 = j_3$.

If factors from the abelian class of $0^21^2$ have only letters as abelian returns, then we obtain the periodic word $w = (0^21^2)^\omega$, and this word does not contain all four blocks. So, a factor from the abelian class of $0^21^2$ should have an abelian return of length greater than 1 (referred to as a long return in what follows), and we consider the four cases corresponding to returns $\alpha_1$ through $\alpha_4$.

Case 2a) let $1^20^2$ have a return $\alpha_1$ with $i_1 > 0$. Then $w$ contains a factor $u = 1^20^2(0^210^2)^{i_1}1$. Now consider right semi-abelian returns to the abelian class of $0^41$. One can find right semi-abelian returns 0 (in the factor $0^410$ of $u$) and 1 (in $10^41$). Since $u$ has a prefix $1^20^4$, it means that there is a long right semi-abelian return ending in $1^20^4$, i.e., we have right semi-abelian return $\alpha'_2$ or $\alpha'_3$. A suffix $0^210^21$ of $u$ implies that there is a long right semi-abelian return $\alpha'_3$ or $\alpha'_4$. So, the only possibility is that the abelian class of $0^41$ has abelian returns 0, 1 and $\alpha'_3 \sim_{ab} \alpha'_1$ with $j_1 = j_3 > 0$, and hence nothing else. The factor $u$ has a suffix $0^210^21$, so the factor $0^210^2$ here has right semi-abelian return $\alpha'_3$, and therefore $u$ is continued in a unique way: $u' = 1^20^2(0^210^2)^{i_1}(1^20^2)^{j_3}0^2$. One can find here two right semi-abelian returns 0 and 1 to the abelian class of $0^21^2$, and we started with the first long right semi-abelian return $\alpha_1$, so the three returns to $0^21^2$ are 0, 1 and $\alpha_1 \sim_{ab} \alpha_3$. The factor $u'$ has a suffix $1^20^4$, so the factor $1^20^2$ here has right semi-abelian return $\alpha_1$, therefore it is continued in a unique way: $u'' = 1^20^2(0^210^2)^{i_1}(1^20^2)^{j_3}(0^210^2)^{i_1}1$. Continuing this line of reasoning, we obtain a periodic word. One can find a factor having one semi-abelian return, e.g., $(1^20^2)^{j_3-1}1^2$. Hence $1^20^2$ has no long right semi-abelian returns of the form $\alpha_1$.

Case 2b) let $10^21$ have a return $\alpha_3$ with $i_3 > 0$. Then $w$ contains a factor $u = 10^21(0^41)^{i_3}1$. Now consider right semi-abelian returns to the abelian class of $0^41$. One can find right semi-abelian returns 0 (in the factor $0^410$ of $u$) and 1 (in $10^41$). Since $u$ has a prefix $10^210^2$, it means that there is a long right semi-abelian return ending in $10^210^2$, i.e., we have right semi-abelian return $\alpha'_1$ or $\alpha'_4$. A suffix $0^41^2$ of $u$ implies that there is a long right semi-abelian return $\alpha'_1$ or $\alpha'_2$. So, the only possibility is that the abelian class of $10^4$ has abelian returns 0, 1 and $\alpha'_1 \sim_{ab} \alpha'_3$ with $j_1 = j_3 > 0$. The factor $u$ has a suffix $0^41^2$, so the factor $0^41$ here has right semi-abelian return $\alpha'_1$, so $u$ is continued in a unique way: $u' = 10^21(0^41)^{i_3}(10^21)^{j_1}0^2$. This factor has a suffix $10^210^2$, so the factor $10^21$ here has right semi-abelian return $\alpha_3$, and therefore it is continued in a unique way: $u'' = 10^21(0^41)^{i_3}(10^21)^{j_1}(0^41)^{i_3}1$. Continuing this line of reasoning, we obtain a periodic word. One can find a factor having one semi-abelian return, e.g., $(0^41)^{i_3-1}0^4$. Hence $10^21$ has no long right semi-abelian returns of the form $\alpha_3$.

Case 2c) let $1^20^2$ have a return $\alpha_2$ with $i_2 \ge 0$. Notice that if $1^20^2$ has only return $\alpha_2$, then $w = (1^20^2(0^210^2)^{i_2}0^2)^\omega$, and $w$ does not contain the block $0^2$. We proved that there are no long right semi-abelian returns of the forms $\alpha_1$ and $\alpha_3$, so the only possibility is that $1^20^2$ has two returns $\alpha_2$ and 1, and $10^21$ always has return 1, otherwise this abelian class has more than 3 abelian returns. So, $1^20^2$ is followed by either $(0^210^2)^{i_2}0^21^2$ or 1. In both cases we can determine several next letters: in the first case the next symbols are 00 (because $w$ contains at most two consecutive 1's), in the second case the next symbols are 100 (since $10^21$ always has return 1, and 11 is always followed by 00). So, $1^20^2$ is followed by either $(0^210^2)^{i_2}0^21^20^2$ or $1^20^2$. Both continuations have suffix $1^20^2$, which is followed by either 1 or $0^2$, etc.



Thus $w \in \{(0^210^2)^{i_2}0^21^20^2, 1^20^2\}^\omega$. We are interested in the case when all four blocks are contained in $w$, so we get $i_2 > 0$, otherwise $w$ does not contain the block $1$, and we get into case 1), which we proved is impossible.

So, $w$ contains a factor $u = 1^20^2(0^210^2)^{i_2}0^21^2$ for some $i_2 > 0$. Now consider abelian returns to the abelian class of $0^41$. One can find right semi-abelian returns 0 (in the factor $0^410$ of $u$) and 1 (in $10^41$). Since $u$ has a prefix $1^20^4$, it means that there is a long right semi-abelian return ending in $1^20^4$, i.e., we have right semi-abelian return $\alpha'_2$ or $\alpha'_3$. A suffix $0^41^2$ of $u$ implies that there is a long right semi-abelian return $\alpha'_1$ or $\alpha'_2$. The only possibility is that the abelian class of $10^4$ has abelian returns 0, 1 and $\alpha'_2$ with $j_2 > 0$, and nothing else. The set of abelian returns 0, 1 and $\alpha'_1 \sim_{ab} \alpha'_3$ is impossible since in this case the abelian class of $1^20^2$ has other abelian returns than 0, 1, $\alpha_2$. The factor $u$ has a suffix $0^41^2$, so the factor $0^41$ here has right semi-abelian return
02, so u is continued in the unique way: u = 1^0^(0^10^)*20^1(10^iy20^. This factor has a suffix 
10^10^, but we proved above that in the case 2c) the factor 10^1 is always followed by 1, so we 
get a contradiction. Hence $1^20^2$ has no returns of the form $\alpha_2$.

Case 2d) let $10^21$ have a return $\alpha_4$ with $i_4 \ge 0$. Notice that if $10^21$ has only return $\alpha_4$, then $w = (0^21(0^41)^{i_4})^\omega$, and $w$ does not contain the block $1^2$. We proved that there are no long returns of the forms $\alpha_1$, $\alpha_2$ and $\alpha_3$, so the only possibility is that $10^21$ has two returns $\alpha_4$ and 1, and $1^20^2$ always has return 1. So, $10^21$ is followed by either $(0^41)^{i_4}0^21$ or 1. In the second case we can determine the next several letters to be 001 (because 11 is always followed by 00, and $1^20^2$ always has return 1). So, $10^21$ is followed by either $(0^41)^{i_4}0^21$ or $10^21$. Both continuations have suffix $10^21$, which is followed by either $(0^41)^{i_4}0^21$ or 1.



Thus $w \in \{(0^41)^{i_4}0^21, 10^21\}^\omega$. We are interested in the case when all four blocks are contained in $w$, so we get $i_4 > 0$, otherwise $w$ does not contain the block $0^4$.

Thus $w$ contains a factor $u = 10^21(0^41)^{i_4}0^21$. Now consider abelian returns to the abelian class of $0^41$. One can find right semi-abelian returns 0 (in a factor $0^410^2$ of $u$) and 1 (in $10^41$). Since $u$ has a prefix $10^210^2$, we have a long right semi-abelian return ending in $10^210^2$, i.e., $\alpha'_1$ or $\alpha'_4$. A suffix $0^210^21$ of $u$ implies that there is a long right semi-abelian return $\alpha'_3$ or $\alpha'_4$ with $j_4 > 0$. The only possibility is that the abelian class of $0^41$ has abelian returns 0, 1 and $\alpha'_4$ with $j_4 > 0$. The set of abelian returns 0, 1 and $\alpha'_1 \sim_{ab} \alpha'_3$ is impossible since in this case the abelian class of $0^21^2$ has other abelian returns than 0, 1 and $\alpha_4$. Considering the suffix $0^210^21$ of $u$, we get that the factor $0^210^2$ here has right semi-abelian return $\alpha'_4$, so $u$ is continued in a unique way: $u' = 10^21(0^41)^{i_4}0^2(1^20^2)^{j_4}10^2$. The factor $u'$ has a suffix $10^210^2$, so the factor $10^21$ here has right semi-abelian return $\alpha_4$, so it is continued in a unique way: $u'' = 10^21(0^41)^{i_4}0^2(1^20^2)^{j_4}1(0^41)^{i_4}0^21$. Continuing this line of reasoning, we obtain a periodic word $w$. Its factor $(0^41)^{i_4-1}0^4$ has only one semi-abelian return. Hence $10^21$ has no long returns $\alpha_4$.

So, we are done with case 2).

Case 3): $w \in (\{0, 0^2\}\{1, 1^2\})^\omega$.

Consider abelian returns for the word $0^21$. Factors of $w$ from the abelian class of $0^21$ could be the following: $10^2$, $0^21$, 010, and each of them necessarily appears in $w$.

• $10^2$ has right semi-abelian return 1

• $0^21$ has right semi-abelian returns of the form $\alpha_1 = (101)^{i_1}0$ and/or $\alpha_2 = (101)^{i_2}10^2$ for some $i_1, i_2 \ge 0$.

• 010 has right semi-abelian returns of the form $\alpha_3 = (110)^{i_3}0$ and/or $\alpha_4 = (110)^{i_4}10$ for some $i_3, i_4 \ge 0$.

Symmetrically, we get possible abelian returns for $1^20$:

• $01^2$ has right semi-abelian return 0

• $1^20$ has right semi-abelian returns of the form $\alpha'_1 = (010)^{j_1}1$ and/or $\alpha'_2 = (010)^{j_2}01^2$ for some $j_1, j_2 \ge 0$.

• 101 has right semi-abelian returns of the form $\alpha'_3 = (001)^{j_3}1$ and/or $\alpha'_4 = (001)^{j_4}01$ for some $j_3, j_4 \ge 0$.



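These are summarized in the table below:

abelian class | word | possible right semi-abelian returns
$0^21$ | $10^2$ | 1
 | $0^21$ | $\alpha_1 = (101)^{i_1}0$, $\alpha_2 = (101)^{i_2}10^2$ for some $i_1, i_2 \ge 0$
 | 010 | $\alpha_3 = (110)^{i_3}0$, $\alpha_4 = (110)^{i_4}10$ for some $i_3, i_4 \ge 0$
$1^20$ | $01^2$ | 0
 | $1^20$ | $\alpha'_1 = (010)^{j_1}1$, $\alpha'_2 = (010)^{j_2}01^2$ for some $j_1, j_2 \ge 0$
 | 101 | $\alpha'_3 = (001)^{j_3}1$, $\alpha'_4 = (001)^{j_4}01$ for some $j_3, j_4 \ge 0$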



Notice that $\alpha_1 \sim_{ab} \alpha_3$ when $i_1 = i_3$, and $\alpha'_1 \sim_{ab} \alpha'_3$ when $j_1 = j_3$. In this case the lengths of blocks of 0's and 1's are the same, so we can use symmetry in the proofs.

If factors from the abelian class of $0^21$ have only letters as abelian returns, then $w = (0^21)^\omega$, and this word does not contain all four blocks. So, a factor from the abelian class of $0^21$ should have a long abelian return (of length greater than 1), and we consider the four cases corresponding to long returns $\alpha_1$ through $\alpha_4$.

Case 3a) let $0^21$ have a return $\alpha_1$ with $i_1 > 0$. Then $w$ contains a factor $u = 0^21(101)^{i_1}0$. Now consider abelian returns to the abelian class of $1^20$. One can find right semi-abelian returns 1 (in a factor 1101) and 0 (in 0110). Since $u$ has a prefix $0^21^2$, it means that there is a long right semi-abelian return ending in $0^21^2$, i.e., $\alpha'_2$ or $\alpha'_3$. A suffix 1010 of $u$ implies that there is a long right semi-abelian return $\alpha'_3$ or $\alpha'_4$. So, the only possibility is that the abelian class of $1^20$ has abelian returns 0, 1 and $\alpha'_3 \sim_{ab} \alpha'_1$ with $j_1 = j_3 > 0$. Considering the suffix 1010 of $u$, we get that the factor 101 here has right semi-abelian return $\alpha'_3$, so $u$ is continued in a unique way: $u' = 0^21(101)^{i_1}(001)^{j_3}1$. One can find in $u'$ abelian returns 0 and 1 to the abelian class of $0^21$, and we started with the long return $\alpha_1 \sim_{ab} \alpha_3$. The factor $u'$ has a suffix $0^21^2$, so the factor 001 here has right semi-abelian return $\alpha_1$, and hence $u'$ is continued in a unique way: $u'' = 0^21(101)^{i_1}(001)^{j_3}(101)^{i_1}0$. Continuing this line of reasoning, we obtain a periodic word, in which the abelian class of $1(101)^{i_1}$ has one semi-abelian return. Hence $0^21$ has no long returns $\alpha_1$, and symmetrically $1^20$ has no long returns $\alpha'_1$.

Case 3b) let 010 have a return $\alpha_3$ with $i_3 > 0$. Then $w$ contains a factor $u = 010(110)^{i_3}0$. Now consider abelian returns to the abelian class of $1^20$. One can find right semi-abelian returns 1 (in a factor 1011) and 0 (in 0110). Due to the prefix 0101 of $u$, there is a long right semi-abelian return ending in 0101, i.e., $\alpha'_1$ or $\alpha'_4$. The suffix 1100 of $u$ implies that there is a long right semi-abelian return $\alpha'_1$ or $\alpha'_2$. We proved that there are no long returns of the form $\alpha'_1$, so $1^20$ has right semi-abelian returns 0, 1, $\alpha'_4$, $\alpha'_2$. No two of them are abelian equivalent, a contradiction. Hence $0^21$ has no returns of the form $\alpha_3$, and symmetrically $1^20$ has no returns $\alpha'_3$.

Case 3c) let $0^21$ have a return $\alpha_2$. The abelian class of 001 always has abelian return 1. If $0^21$ has only return $\alpha_2$, then $w = ((101)^{i_2}10^21)^\omega$, and the factor $0^2$ has only one abelian return. So, $0^21$ has also other abelian returns. Taking into account that there are no long returns of the forms $\alpha_1$ and $\alpha_3$, and $\alpha_2$ is never abelian equivalent to $\alpha_4$, we get that there should be abelian return 0. Hence, there is no abelian return $\alpha_4$ and 010 is always followed by 0, while $0^21$ is followed by either 0 or $\alpha_2$. So, $w$ contains a factor $u = 0^21(101)^{i_2}10^2$, $i_2 \ge 0$. Now consider abelian returns to the abelian class of $1^20$. Since $u$ has a prefix $0^21^2$, it means that there is a long right semi-abelian return ending in $0^21^2$, i.e., we have right semi-abelian return $\alpha'_2$ or $\alpha'_3$. A suffix $1^20^2$ of $u$ implies that there is a long right semi-abelian return $\alpha'_1$ or $\alpha'_2$. We proved that we never have long return $\alpha'_1$, so we have right semi-abelian return $\alpha'_2$. Symmetrically to what we proved above, we get that 101 is always followed by 1, and 110 is followed by either 1 or $\alpha'_2$. So, the last occurrence of 110 in $u$ is extended by $\alpha'_2$, i.e. we get the unique extension of $u$: $u' = 0^21(101)^{i_2}10(010)^{j_2}01^2$. Considering the last occurrence of the factor 001 in $u'$, we get that it should have right semi-abelian return $\alpha_2$, i.e. we get the unique extension: $u'' = 0^21(101)^{i_2}10(010)^{j_2}01(101)^{i_2}10^2$. Continuing this line of reasoning, we get a periodic word, in which the factor $0(010)^{j_2}0$ has only one semi-abelian return. Hence we have no returns of the form $\alpha_2$ and $\alpha'_2$.

Case 3d) In the remaining case the word 010 has returns 0 and $\alpha_4$ with $i_4 \ge 0$, and the word 101 has returns 1 and $\alpha'_4$ with $j_4 \ge 0$. So, $w$ contains a factor $u = 010(110)^{i_4}10$. Considering the last occurrence of 101 in $u$, we see that it has return $\alpha'_4$, so $u$ is extended in the following way: $010(110)^{i_4}1(001)^{j_4}01$. The last occurrence of 010 in this word necessarily has right semi-abelian return $\alpha_4$, so the word is extended uniquely as follows: $010(110)^{i_4}1(001)^{j_4}0(110)^{i_4}10$. Continuing this line of reasoning, we get a periodic word. In this word $i_4 > 0$, otherwise we do not have occurrences of the block $1^2$, and the abelian class of $(110)^{i_4}1$ has only one semi-abelian return.

So, we are done with case 3).

Case 4): $w \in (\{0^2, 0^4\}\{1^2, 1^4\})^\omega$.

This case is considered in exactly the same way as case 3), by considering abelian returns to $0^41^2$ and $0^21^4$. The only changes to be made are doubling 0's and 1's everywhere except in returns of length 1 (letters). □

Lemma 19. If $w \in \{0^{l_1}1, 0^{l_2}1\}^\omega$, $0 < l_1 < l_2$, is a recurrent word such that each of its factors has at most three abelian returns and at least two semi-abelian returns, then $l_2 = l_1 + 1$.

Proof. Suppose that $l_2 > l_1 + 1$. Consider abelian returns to the word $0^{l_1+1}$: it has right abelian returns 0 and $1(0^{l_1}1)^k0^{l_1+1}$ for all $k \ge 0$ such that $0^{l_2}1(0^{l_1}1)^k0^{l_2}$ is a factor of $w$, thus there could be at most two different values of $k$ (possibly including 0). Consider abelian returns to the word $10^{l_1}10$: it has right abelian returns 0 and $(0^{l_2-1}10)^j0^{l_1-1}1$ for all $j \ge 0$ such that $10^{l_1}1(0^{l_2}1)^j0^{l_1}1$ is a factor of $w$, thus there could be at most two different values of $j$ (possibly including 0). If we have only one value of $k$ and one value of $j$ simultaneously, then $w$ is periodic, $w = ((0^{l_1}1)^{k_1}(0^{l_2}1)^{j_1})^\omega$. In this periodic word, if $k_1 = 0$, then the factor $0^{l_2}$ has one semi-abelian return; if $k_1 > 0$, then the abelian class of $1(0^{l_1}1)^{k_1}$ has only one semi-abelian return. So, we have two cases:

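Case I: $w \in (0^{l_2}1\{(0^{l_1}1)^{k_1}, (0^{l_1}1)^{k_2}\})^\omega$, $0 < k_1 < k_2$. In this case one can find four abelian returns to $0^{l_2}10^{l_1-1}$: 0, $10^{l_1-1}$, $(10^{l_1})^{k_1-1}10^{l_2-1}$, $(10^{l_1})^{k_2-1}10^{l_2-1}$.

Case II: $w \in (0^{l_1}1\{(0^{l_2}1)^{j_1}, (0^{l_2}1)^{j_2}\})^\omega$, $0 < j_1 < j_2$. In this case one can find four abelian returns to $10^{l_1}10^{l_2}10$: 0, $0^{l_2-1}1$, $(0^{l_2-1}10)^{j_1-1}0^{l_1-1}1$, $(0^{l_2-1}10)^{j_2-1}0^{l_1-1}1$. □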



The proofs of Lemma 18 and Lemma 19 immediately imply 



Corollary 20. If each factor of an infinite binary recurrent word $w$ has at most three abelian returns and at least two semi-abelian returns, then $w \in \{0^{l_1}1, 0^{l_1+1}1\}^\omega$.
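The shape described in Corollary 20 (one letter isolated, the other letter in blocks of two consecutive lengths) is easy to test on a finite prefix. The sketch below is an illustration under our own naming (`run_lengths`, `has_corollary_20_form`, `fibonacci_word` are not from the paper); the first and last runs are discarded because they may be truncated by the prefix.

```python
from itertools import groupby


def fibonacci_word(n):
    """Prefix of length n of the fixed point of 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]


def run_lengths(w):
    """Sets of block lengths of 0's and of 1's, ignoring the two boundary runs."""
    runs = [(letter, len(list(group))) for letter, group in groupby(w)]
    lengths = {"0": set(), "1": set()}
    for letter, n in runs[1:-1]:
        lengths[letter].add(n)
    return lengths


def has_corollary_20_form(w):
    """True if one letter is isolated and the other has block lengths within {l, l+1}."""
    lengths = run_lengths(w)
    for other, isolated in (("0", "1"), ("1", "0")):
        blocks = sorted(lengths[other])
        if lengths[isolated] == {1} and blocks and blocks[-1] - blocks[0] <= 1:
            return True
    return False


print(run_lengths(fibonacci_word(200)))
print(has_corollary_20_form(fibonacci_word(200)))
print(has_corollary_20_form("0011" * 10))
```

For the Fibonacci word the letter 1 is isolated and 0 occurs in blocks of lengths 1 and 2, so the prefix has the form $\{0^{l_1}1, 0^{l_1+1}1\}^\omega$ with $l_1 = 1$.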

Lemma 21. If each factor of a recurrent infinite binary word $w$ has at most three abelian returns and at least two semi-abelian returns, then $w$ is 2-balanced.

Proof. For a length $n$, consider abelian classes of factors of length $n$ of $w$. Denote by $A$ the abelian class of factors containing the smallest number of 1's: $A = \{u \in F_n(w) : |u|_1 = \min_{v \in F_n(w)} |v|_1\}$. The next class we denote by $B$: $B = \{u \in F_n(w) : |u|_1 = \min_{v \in F_n(w)} |v|_1 + 1\}$, and the next one by $C$. If $w$ has only two abelian classes, then it is Sturmian, so we are interested in the case when $w$ has at least three abelian classes. For a length $n$, we associate to the word $w$ a word $\xi^{(n)}$ over the alphabet of abelian classes of $w$ of length $n$ as follows: for an abelian class $M$ of words of length $n$, $\xi^{(n)}_k = M$ iff $w_k \dots w_{k+n-1} \in M$. In other words, $(\xi^{(n)}_k)_{k \ge 0}$ is the sequence of abelian classes of consecutive factors of length $n$ in $w$.
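As an illustration of this construction, the sequence of abelian classes of consecutive length-$n$ windows can be computed with a sliding count of 1's. This is a sketch with hypothetical names (`class_weights`, `class_word`); classes are labelled A, B, C, ... by how many 1's they contain above the minimum, mirroring the notation of the proof.

```python
def class_weights(w, n):
    """Number of 1's in each window w[k:k+n], for k = 0, 1, ..., len(w) - n."""
    ones = w[:n].count("1")
    weights = [ones]
    for k in range(len(w) - n):
        # Slide the window by one letter: w[k] leaves, w[k + n] enters.
        ones += (w[k + n] == "1") - (w[k] == "1")
        weights.append(ones)
    return weights


def class_word(w, n):
    """Word over {A, B, C, ...}: A is the class with the fewest 1's, B the next, etc."""
    weights = class_weights(w, n)
    lowest = min(weights)
    return "".join(chr(ord("A") + x - lowest) for x in weights)


print(class_word("0100101001", 3))    # a balanced example: only A and B occur
print(class_word("001100110011", 2))  # three classes A, B, C occur
```

Consecutive letters of the resulting word differ by at most one step (A to B, B to C, and back), since sliding the window changes the number of 1's by at most one.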

It is easy to see that $\xi^{(n)}$ contains the following sequence of classes: $CB^{j_1}A^{j_2}B$ for some $j_1, j_2 \ge 1$, i.e. for some $i$ we have $\xi^{(n)}_i \dots \xi^{(n)}_{i+j_1+j_2+1} = CB^{j_1}A^{j_2}B$. Then we have

$w_i = 1$, $w_{i+n} = 0$,

$w_k = w_{k+n}$ for $k = i+1, \dots, i+j_1-1$,

$w_{i+j_1} = 1$, $w_{i+j_1+n} = 0$,

$w_k = w_{k+n}$ for $k = i+j_1+1, \dots, i+j_1+j_2-1$,

i.e., $w_i \dots w_{i+j_1+j_2} = 1u1v0$, $w_{i+n} \dots w_{i+j_1+j_2+n} = 0u0v1$.



By Corollary 20 we have $w \in \{0^{l_1}1, 0^{l_1+1}1\}^\omega$, so $|u| \ge 2l_1 + 1$; $u$ contains both letters 0 and 1 and has a suffix $0^{l_1}$. It follows that $j_2 = 1$. So, the class $B$ has the following 3 abelian returns: 0, 1, 01. All the returns are of length at most 2, so if after an occurrence of $B$ we have $C$, then the next class is $B$ again, otherwise we will get a longer return. So there are no other classes than these. In addition, we proved that if for length $n$ there are three abelian classes, then in $\xi^{(n)}$ the letters $A$ and $C$ are isolated. □
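The balance property can be measured directly on a finite prefix. The following sketch (the function names `fibonacci_word` and `imbalance` are our own) computes the largest difference in the number of 1's between two factors of the same length, for all lengths up to a bound; a word is balanced when this value is at most 1 and 2-balanced when it is at most 2.

```python
def fibonacci_word(n):
    """Prefix of length n of the fixed point of 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]


def imbalance(w, max_len):
    """Largest value of ||u|_1 - |v|_1| over factors u, v of w with |u| = |v| <= max_len."""
    worst = 0
    for n in range(1, max_len + 1):
        counts = [w[i:i + n].count("1") for i in range(len(w) - n + 1)]
        worst = max(worst, max(counts) - min(counts))
    return worst


print(imbalance(fibonacci_word(200), 10))  # a Sturmian prefix stays within 1
print(imbalance("0011" * 10, 4))           # the factors 00 and 11 differ by 2
```

Only factors of the given finite prefix are inspected, so the result is a lower bound on the imbalance of the infinite word.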



Proof of Proposition 17. By Corollary 20 and Lemma 21, we have that $w$ is 2-balanced and it is of the form $\{0^{l_1}1, 0^{l_1+1}1\}^\omega$ for some integer $l_1$. Suppose that $w$ is not balanced. Then there exists $n$ for which there exist three classes of abelian equivalence in $F_n(w)$; as above, denote these classes by $A$, $B$ and $C$. Arguing as in the proof of Lemma 21, consider a sequence of classes $BCB^jAB$ which we necessarily have in $\xi^{(n)}$ for some integer $j$, and denote its starting position by $i - 1$. The corresponding factor in $w$ satisfies

$w_{i-1} = 0$, $w_{i-1+n} = 1$,

$w_i = 1$, $w_{i+n} = 0$,

$w_k = w_{k+n}$ for $k = i+1, \dots, i+j-1$,

$w_{i+j} = 1$, $w_{i+j+n} = 0$,

$w_{i+j+1} = 0$, $w_{i+j+1+n} = 1$,

i.e., $w_i \dots w_{i+j+1} = 1u10$, $w_{i+n} \dots w_{i+j+1+n} = 0u01$. Remark that $u = w_{i+1} \dots w_{i+j-1}$ has prefix $0^{l_1}10$.

Now consider abelian returns to the abelian class $B0 = A1$ of length $n+1$. The factor starting from the position $i + 1$ is of the form $B0$, so it belongs to this class, and has an abelian return 0. The word starting from the position $i + j$ is of the form $B0$ and has an abelian return 1. The word starting from the position $i + l_1 - 1$ belongs to this class, and has an abelian return 10. So we have at least three returns 0, 1 and 10. Now consider the occurrence of the class $B0 = A1$ to the left of the position $i + 1$. One can see that the positions $i$ and $i - 1$ are from the class $B1 = C0$, so the preceding occurrence of $B0 = A1$ has an abelian return of length greater than 2, which is a fourth return, though there should be at most three. So we cannot have more than two classes of abelian equivalence in a binary word having two or three abelian returns, i.e., such a word should be balanced. Proposition 17 is proved. □



Lemma 22. Let $w \in \{0, 1\}^\omega$ be a recurrent balanced word. Then $w$ is either Sturmian or periodic. In the latter case there exists a (possibly empty) bispecial factor $B$ of $w$ and a letter $a \in \{0, 1\}$ such that $aBa$ is a factor of $w$ having exactly one first return in $w$. Since $aBa$ is the unique element in its abelian class, it follows that if $w$ is periodic then $w$ contains a factor having only one semi-abelian return.

Proof. Since $w$ is assumed balanced, $w$ contains at most one right special factor for each length $n$. If $w$ is not Sturmian, then $w$ is ultimately periodic, and hence periodic since it is recurrent. From here on we shall assume that $w$ is periodic. Thus $w$ has only a finite number of right special factors. As $w$ is recurrent, the longest right special factor of $w$ is also a bispecial factor of $w$. Let $\varepsilon = B_0, B_1, \dots, B_n$ denote the bispecial factors of $w$ in order of increasing length. Thus $B_n$ is also the longest right special factor of $w$. Set $B = B_{n-1}$. Then there exists a unique letter $a \in \{0, 1\}$ such that $aB$ is a right special factor. In particular both $aBa$ and $bBa$ are factors of $w$, where $a \ne b \in \{0, 1\}$. We claim that the only right special factor of $w$ which begins in $Ba$ is $B_n$. Clearly, $B_n$ is a right special factor beginning in $Ba$ (since $Ba$ is left special and hence must coincide with the prefix of $B_n$ of the same length). To see that no other right special factor of $w$ begins in $Ba$, let $R$ denote the shortest right special factor of $w$ beginning in $Ba$. Then $R$ is also left special and hence bispecial. It follows that $R = B_n$. Since $B_n$ is also the longest right special factor of $w$, the claim is established. Having established the claim, it follows that $aBa$ has a unique first return in $w$. If not, there would exist a right special factor beginning in $aBa$. From the previous claim it would follow that $aB_n$ is right special, contradicting that $B_n$ is the longest right special factor. □
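Lemma 22 can be traced on a small periodic balanced word. The sketch below uses $w = (001)^\omega$, truncated to a finite prefix, and hypothetical helper names (`extensions`, `bispecial_factors`, `first_returns`). Its bispecial factors are the empty word and 0, so in the notation of the proof $B_n = 0$, $B = B_{n-1} = \varepsilon$, $a = 0$, and $aBa = 00$.

```python
def extensions(w, u):
    """Sets of letters that precede and follow occurrences of u inside w."""
    n = len(u)
    left, right = set(), set()
    for i in range(len(w) - n + 1):
        if w[i:i + n] == u:
            if i > 0:
                left.add(w[i - 1])
            if i + n < len(w):
                right.add(w[i + n])
    return left, right


def bispecial_factors(w, max_len):
    """Factors of length <= max_len that extend in two ways on both sides."""
    found = []
    for n in range(max_len + 1):
        for u in sorted({w[i:i + n] for i in range(len(w) - n + 1)}):
            left, right = extensions(w, u)
            if len(left) == 2 and len(right) == 2:
                found.append(u)
    return found


def first_returns(w, u):
    """Words between the starts of consecutive occurrences of u in w."""
    occ = [i for i in range(len(w) - len(u) + 1) if w[i:i + len(u)] == u]
    return {w[a:b] for a, b in zip(occ, occ[1:])}


w = "001" * 20
print(bispecial_factors(w, 5))  # the empty word and "0"
print(first_returns(w, "00"))   # a single first return
```

Since 00 is alone in its abelian class among the factors of this word, its single first return is also its only semi-abelian return.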

We are now ready to prove the sufficiency condition: 

Corollary 23. If each factor of a binary recurrent infinite word has two or three abelian returns, then the word is Sturmian.

Proof. Follows from Proposition 17 and Lemma 22. □



Corollary 24. An aperiodic recurrent infinite word $w$ is Sturmian if and only if each factor $u$ of $w$ has two or three abelian returns in $w$.

Proof. Lemma [6] implies that an aperiodic word with 2 or 3 abelian returns must necessarily be 
binary. □ 
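The necessity direction of this characterization can be observed experimentally on the Fibonacci word. The sketch below is an illustration rather than a proof: it inspects a prefix of length 3000 and factors of length at most 5, both of which are arbitrary choices of ours, and the function names are hypothetical.

```python
def fibonacci_word(n):
    """Prefix of length n of the fixed point of 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]


def parikh(u):
    """Parikh vector (number of 0's, number of 1's) of a binary word."""
    return (u.count("0"), u.count("1"))


def abelian_returns(w, v):
    """Abelian classes of the words between consecutive occurrences of the class of v."""
    n, target = len(v), parikh(v)
    occ = [i for i in range(len(w) - n + 1) if parikh(w[i:i + n]) == target]
    return {parikh(w[a:b]) for a, b in zip(occ, occ[1:])}


w = fibonacci_word(3000)
counts = {}
for n in range(1, 6):
    for v in {w[i:i + n] for i in range(len(w) - n + 1)}:
        counts[v] = len(abelian_returns(w, v))

print(len(counts))                  # number of factors of length 1 to 5
print(sorted(set(counts.values()))) # the numbers of abelian returns that occur
```

For instance, the factor 0 has two abelian returns (the classes of 0 and 01), while the factor 01 has three (the classes of 0, 1 and 10).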

6. Proof of Theorem [3] 

In this section we prove the characterization of Sturmian words in terms of semi-abelian 
returns. 

Proof of Theorem [3]. We have that for every factor of an infinite word the number of its semi-abelian returns is not less than the number of its abelian returns. So, Proposition 17 and Lemma 22 imply that if each factor of an infinite binary recurrent word has two or three semi-abelian returns, then the word is Sturmian.

Now we prove that each factor of a Sturmian word has at most three semi-abelian returns. Suppose that a factor $v$ of a Sturmian word has more than three semi-abelian returns. By Proposition 11 this factor has at most three abelian returns, so there are at least two semi-abelian returns which are abelian equivalent. By Proposition 9, semi-abelian returns to factors of Sturmian words are Christoffel words, i.e., letters or words of the form $aBb$, so if we have more than three semi-abelian returns to $v$, then there should be both returns $0B1$ and $1B0$.

In the case $|v| \ge |0B1|$ the return $0B1$ is given by a factor $0B1x1B0$ for some $x \in \{0, 1\}^*$, where $0B1x$ is abelian equivalent to $v$. The return $1B0$ is given by a factor $1B0y0B1$ for some $y \in \{0, 1\}^*$, where $1B0y$ is abelian equivalent to $v$. So, we have factors $1x1$ and $0y0$, where $x$ and $y$ are abelian equivalent, a contradiction with balance.

In the case $1 < |v| < |0B1|$ we have a factor $z$ whose (intersecting) prefix and suffix are $0B1$ and $1B0$, resp., and another factor $z'$ of the same length whose prefix and suffix are $1B0$ and $0B1$, resp. So $B$ should have 1 and 0 at the same position, which is impossible.

If $|v| = 1$, i.e., $v$ is a letter, it is easy to see that $v$ has two semi-abelian returns.

Thus, two different semi-abelian returns of the same length greater than 1 are impossible. 
This concludes the proof. □ 
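The relation between the two kinds of returns used in this proof can also be checked numerically on the Fibonacci word. As before, this is only a sketch on a finite prefix with names of our own choosing; it uses left semi-abelian returns, and the prefix length and factor lengths are arbitrary.

```python
def fibonacci_word(n):
    """Prefix of length n of the fixed point of 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]


def parikh(u):
    """Parikh vector (number of 0's, number of 1's) of a binary word."""
    return (u.count("0"), u.count("1"))


def semi_abelian_returns(w, v):
    """Words between the starts of consecutive occurrences of the abelian class of v."""
    n, target = len(v), parikh(v)
    occ = [i for i in range(len(w) - n + 1) if parikh(w[i:i + n]) == target]
    return {w[a:b] for a, b in zip(occ, occ[1:])}


w = fibonacci_word(3000)
report = {}
for n in range(1, 6):
    for v in sorted({w[i:i + n] for i in range(len(w) - n + 1)}):
        semi = semi_abelian_returns(w, v)
        abelian = {parikh(u) for u in semi}
        report[v] = (len(abelian), len(semi))

for v, (n_abelian, n_semi) in report.items():
    print(v, n_abelian, n_semi)
```

Each printed line gives a factor, its number of abelian returns, and its number of semi-abelian returns; the second number is never smaller than the first, and for this Sturmian word neither exceeds three.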



Similarly to Corollary 24, we get 



Corollary 25. An aperiodic recurrent infinite word $w$ is Sturmian if and only if each factor $u$ of $w$ has two or three semi-abelian returns in $w$.